Web mining: extraction of information and knowledge discovery from the enterprise websites
Authors
Abstract
Making practical, effective use of the enormous quantity of data available on the web is the focus of many researchers. This article lays out a framework for discovering potentially profitable collaborative networks among firms using information available on the internet. Such knowledge is the primary reason why companies attempt to co-operate. To support this knowledge discovery, it is essential to identify the activity fields and the skills, or “savoir faire”, of each business. This article presents a Web Mining approach founded on an application for gathering and processing textual corpora drawn from the companies’ own websites. The aim of the work is to automatically detect the NAF code (Nomenclature of French Activity) of an enterprise by exploring only its website. Similarity measures can then be compared. Our developments are based on an original method. Evaluation tests have been carried out and the results are very encouraging.
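The abstract does not describe the original method itself, so the following is only a minimal sketch of the general idea, under stated assumptions: website text is reduced to a bag of words and compared, by cosine similarity, against short activity descriptions. The function names, the placeholder codes `CODE_A` and `CODE_B`, and the sample descriptions are all hypothetical; they are not real NAF codes and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphabetic tokens (a bag of words)."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical activity profiles keyed by placeholder codes (NOT real NAF codes).
ACTIVITY_PROFILES = {
    "CODE_A": "software development programming consulting computer systems",
    "CODE_B": "bakery bread pastry flour baking",
}


def guess_activity(site_text: str, profiles: dict) -> str:
    """Return the code whose description is most similar to the site text."""
    site_vec = tokenize(site_text)
    scores = {code: cosine(site_vec, tokenize(desc)) for code, desc in profiles.items()}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    text = "We bake fresh bread and pastry every morning"
    print(guess_activity(text, ACTIVITY_PROFILES))
```

The same `cosine` function could be applied to the text of two company websites to obtain a rough similarity score between firms, which is one simple way to read "similarity measures can then be compared".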
Similar resources
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Expert Discovery: A web mining approach
Expert discovery is a quest to answer the question: “Who is the best expert of a specific subject in a particular domain within peculiar array of parameters?” An expert with domain knowledge in a given field is crucial for consulting in industry, academia and the scientific community. The aim of this study is to address the issues of the expert-finding task in a real-world community. Collabor...
Data Extraction using Content-Based Handles
In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. Our extraction algorithm is inspired by the way a human user scans the page content for specific data. In particular, we use text fea...
Data Mining: A Novel Outlook to Explore Knowledge in Health and Medical Sciences
Today the medical and healthcare industry generates large amounts of diverse data about patients, disease diagnosis, prognosis, management, hospitals’ resources, electronic patient health records, medical devices, etc. Using the most efficient processing and analysis methods for knowledge extraction is key to cost-saving in clinical decision making. Data mining, sometimes called data or knowledge...
Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms
In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a huge increase in the number of frauds due to abnormal user behavior is resulting in billions of dollars in losses around the world. T...